EVALUATING SCIENCE TEACHER PREPARATION PROGRAMS
BY ASSESSING TEACHERS AND PUPILS



The point of preservice teacher education is to produce
a social advantage or a savings....

...pre-service programs should reduce the cost of on-the-job
training....

...should lead to greater teacher productivity of desirable
pupil outcomes than the alternative. (Turner, 1971,
p. 10)

The instructional techniques and materials employed in science teacher
preparation activities usually have the expressed intent of modifying the
teachers' behavior. The inference then is that when these behaviors are used
in the science classroom they will improve pupil performance.

An early stage in the teacher training process can be identified as
the competency acquisition phase. In this stage the teacher is able to
recognize a competency and begins to internalize it into his own cognitive
structure which later will influence his teaching behavior.

A later stage is the skill application phase, where a teacher exhibits
an overt behavior which is largely influenced by the level of competency
acquisition. For example, teachers can study and apply the various levels
of the Teaching Strategies Observation Differential (a classroom observation
system) until they can recognize and categorize different teaching styles and
communicate within the language of the system. Later, they can manipulate
their own teaching strategies in relation to the teaching/learning environment
and through skill application exhibit a higher level of competency acquisition.

Historically, teacher training programs have been judged on the basis
of the acquisition of knowledge with limited demonstration of skill exhibition.
Only recently has the use of pupil outcomes been promoted as measures of
training program effectiveness (e.g., Dunkin and Biddle, 1974; Popham, 1971).
McNeil and Popham (1973) advocated that the ultimate criterion of teachers'
competence be their impact upon learners. This is also the position taken
by Okey (1977) in a companion paper to this report.

Science teacher program personnel need to concern themselves at all
levels with skill acquisition and application and with resultant pupil changes.
The purpose of this paper is to promote this awareness and offer a mechanism
for carrying out program research and evaluation within the above framework.

The Assessment of Skill Acquisition

Research and evaluation methods related to skill acquisition revolve
around three generic stages of activities: 1) the pre-treatment or
pre-instruction assessment of the skill and related variables, 2) involvement
in instructional activities designed to enhance the competency, and
3) post-assessment of skill acquisition or improvement and related variables
(see Figure 1).

Collecting pre-instruction data can serve different purposes. First,
the information may be used, in the absence of a control or comparison group,
as base line data to determine the effects of training. Second, some pre-data
may be useful as predictors of success in relation to such variables as
personality type or readiness level. Such information can be applied in a
prescriptive format where instructional resources are matched to the situation
of highest probable impact.

During research and evaluation efforts on skill acquisition, the
instructional format may be varied more than normal in order to provide an
opportunity to compare training modes. For example, the comparison might be
between a self-paced modular format and an instructor-centered
lecture/discussion format related to the same skill, or among self study, peer
participation or field experience with pupils as a means of acquiring a
teaching skill.

Post-assessment of skill acquisition may encompass three types of
teacher measures: 1) cognitive, 2) affective, and 3) skill exhibition.
Cognitive performance and attitude data can normally be collected through
conventional pencil and paper measures. But, in many instances skill
exhibition data must be collected through visual or auditory observation and
coded with the aid of an analysis scheme to facilitate communications and
comparisons (Yeany, 1977 and Yeany and Capie, 1977).

Skill Application

Because the modification of pupil behavior is a terminal goal of teaching,
one measure of teaching effectiveness is made through an assessment of
the change brought about in pupils during the skill application phase.
Training program evaluation should not end with an assessment and judgment
related to the level of skill acquisition without determining the degree to
which teachers are able to apply a competency and modify pupil behavior.

The skill application phase of training research and evaluation
involves three related procedures: 1) the collection of pre-application pupil
data, 2) the application of the skill in an instruction setting, and
3) post-application data collection to assess cognitive gains and the attitudes
of pupils toward the content and methods of instruction (see Figure 2). The
acid-test of skill application is whether a desired change in pupil behavior
can be brought about when the conditions are right and instruction is designed
and applied to do so. In other words, is the teaching effective?



To make decisions about training program effectiveness based on pupil
performance is to base decisions on data twice removed from the source of
influence (i.e., the level of skill acquisition and application both act
as filters). But, the ultimate influence of training programs is meant to
be exerted on pupil performance through the trainees.

There are at least two ways that skill application evaluation can be
carried out. A fairly recent model has been developed by Popham, Baker,
Millman and McNeil (1972). These authors describe the teaching performance
test (minilesson) as a means of analyzing and assessing teaching effectiveness.
In this procedure, the teacher is given an explicit, measurable objective
suitable for a short lesson. A sample test item is provided to clarify the
objective, and the teacher prepares and teaches a lesson designed to accomplish
the objective. At the end of the lesson a cognitive and affective post-test
is administered to the pupils. On the basis of these, a judgment of teaching
effectiveness can be made. And, on the basis of such data from all program
trainees, a judgment can be made about program effectiveness. If random
sets of trainees have experienced different treatments, the pupil performance
data can be statistically analyzed to measure relative efficacy of training
methods designed to bring about teacher behaviors that influence pupil
performance.
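
As an illustration of the comparison just described, the sketch below computes
a one-way analysis of variance F ratio on class mean post-test scores grouped
by training treatment. The choice of a one-way ANOVA is an assumption made
here for illustration; the text asks only for a statistical analysis. The
function name and the sample scores are hypothetical.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists; each inner list holds the class mean
    post-test scores obtained by the trainees in one training treatment.
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    # Variation of the treatment means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own treatment mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical class means for three training treatments.
self_study = [1, 2, 3]
peer_practice = [2, 3, 4]
field_practice = [5, 6, 7]
f_ratio, df_b, df_w = one_way_f([self_study, peer_practice, field_practice])
```

The resulting F ratio would be referred to an F table with the returned
degrees of freedom to judge whether the treatments differ.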

The minilesson approach as suggested by Popham, et al. (1972) is an
appealing and viable means of evaluating teacher preparation programs and
has been used for such purposes (e.g., Rezba, Lahnston and Lapp, 1976). But,
the skill application phase of program evaluation should not be limited to
this procedure. An attempt should be made to assess effects of training on
the acquisition of a teaching skill in a less contrived and restricted
context. Also, the selection of the objectives should be the prerogative of the
teacher and should be permitted to vary with the immediate instructional
content in order to increase the generalizability across a sizable set of
objectives. On the surface, this may appear to be a bold suggestion. But,
when the dependent variable is pupil performance as influenced by teaching
toward one externally imposed objective, the judgment about training program
effectiveness must be very tentative, to say the least. The only criterion
which should be imposed on the selection of an objective is whether teaching
toward it will provide an opportunity to apply the teaching competency which
is being assessed.

If absolute judgments about the teaching effectiveness of each individual
in the program are to be made, the above suggestion presents few problems.
Simply decide what kinds of post-test scores are acceptable and base the
decision on that point. On the other hand, if relative comparisons are to be
made (e.g., comparing the training modes of self study, peer practice and
field based practice), the dependent variable of pupil performance needs to
be redefined.

The concept of using standardized gain scores is offered here as a
means of comparing pupil outcomes when teachers are teaching toward different
sets of objectives. This procedure necessitates the administration of
identical or parallel pre-tests and post-tests for each set of objectives.
The objectives are used to provide a focus for the development of the tests.

To carry out the analyses, the dependent variable is redefined as the
amount of change in average pupil performance from pre to post-test in coded
standard deviation units. Each pre-test is used as normative data for the
post-test. The raw pre-test scores are standardized to T-scores with a mean
of 50 and a standard deviation of 10, where $T = 10z + 50$ and
$z = (X - \bar{X}) / s$ (Glass and Stanley, 1970). The raw class mean score
on the post-test is then converted to a standard score by finding the
difference between it and the raw pre-test mean and dividing that value by the
standard deviation of the pre-test and then converting to a T-score as follows:

$$z = \frac{\bar{X}_{post} - \bar{X}_{pre}}{s_{pre}}, \quad \text{and} \quad T_{post} = 10z + 50.$$

It should be evident that the generation of the $T_{pre}$-score is not
computationally necessary because the mean of these scores will always equal
50 with $s = 10$. But, it may be a necessary conceptual step to realize that
the difference between 50 and the $T_{post}$-score represents the gain in pupil
achievement in one-tenth standard deviation units (i.e., an average
$T_{post}$-score of 60 represents one full standard deviation gain in class
achievement). It should also be evident that the z-score could be used but
negative gain scores would be a problem.
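
The conversion described above can be sketched in a few lines of Python.
This is a minimal illustration and not part of the original procedure: the
function name and the sample scores are hypothetical, and the use of the
sample standard deviation (n - 1 denominator) is an assumption, since the
text does not say which form of s is intended.

```python
from statistics import mean, stdev


def class_t_post(pre_scores, post_scores):
    """Return the class T_post score, 10 * z + 50.

    z is the difference between the raw post-test class mean and the raw
    pre-test class mean, divided by the pre-test standard deviation.
    """
    pre_sd = stdev(pre_scores)  # assumption: sample SD (n - 1 denominator)
    if pre_sd == 0:
        raise ValueError("pre-test scores have no variance; z is undefined")
    z = (mean(post_scores) - mean(pre_scores)) / pre_sd
    return 10 * z + 50


# Hypothetical raw scores for one class.
pre = [4, 6, 8]    # mean 6, sample SD 2
post = [7, 8, 9]   # mean 8
t_post = class_t_post(pre, post)  # z = 1, so T_post = 60
```

A T_post of 60 in this example corresponds to a gain of one full pre-test
standard deviation, matching the interpretation given in the text.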

The average class $T_{post}$-scores are then treated as the dependent
variables with conventional statistical analysis procedures to determine
significant differences in pupil outcomes which can be attributed to variation
in training program format. If the research/evaluation design being employed
makes use of randomization across comparison groups, the threat to internal
validity inherent in the use of a pre-test will not be present. If the threat
is present, it is suggested that the pre-test be administered to a random
one-half of the pupils for normative purposes and that the post-tests of the
remaining one-half be used to assess achievement gain.
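
The random one-half assignment suggested above might be carried out as in
the following sketch. The function name, the seed parameter, and the pupil
identifiers are hypothetical and serve only as an illustration.

```python
import random


def split_for_norming(pupil_ids, seed=None):
    """Randomly divide pupils into a pre-test half and a post-test half.

    The first list takes only the pre-test (normative data); the second
    takes only the post-test (used to assess achievement gain).
    """
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    shuffled = list(pupil_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical class of 30 pupils identified by number.
pre_half, post_half = split_for_norming(range(1, 31), seed=7)
```

Because no pupil sees both tests, the pre-test cannot sensitize the pupils
whose post-test scores are used to estimate the gain.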

A comprehensive evaluation of training materials and activities should
include assessment in both competency acquisition and application (Figure 3).
A failure to do both (for some skills) demonstrates an incongruency between
terminal goals (i.e., the modification of pupil behavior) and the evaluation
scheme; or, it denies that the real goal in teacher training programs is to
influence teaching and learning in the science classroom. Also, the setting
and selection of content and resultant outcomes should be free to vary in
order to allow greater generalizability of the findings. To do so will reduce
the tentative nature of our training program decisions, and perhaps, produce
a social advantage and reduce the "cost" of on-the-job training.